🎮 Reinforcement Learning
Q-Learning, Agents, Rewards, Game Theory
Scoured 112812 posts in 297.4 ms
Show HN: Fighting the War Against Expensive Reinforcement Learning
cadenza-landing-qtu7gbjwb-akshparekh123-3457s-projects.vercel.app · 18h · Discuss: Hacker News
🤖 AI
Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards
arxiv.org · 20h
🤖 AI
Mitigating Reward Hacking in RLHF via Bayesian Non-negative Reward Modeling
arxiv.org · 20h
🧠 Machine Learning
A multi-agent reinforcement learning approach to autonomous aircraft taxiing with taxiing time, fuel consumption, and emission optimization
sciencedirect.com · 1d
🤖 AI
check out this article on Reinforcement Learning with R: Origins, Real-Life Applications, and Practical Implementation
dev.to · 2d · Discuss: DEV
🤖 AI
Optimal timing for superintelligence
marginalrevolution.com · 1h
🤖 AI
Truth and paradox in the theory of finite and infinite games, Owens Memorial Lecture, Wayne State University, April 2026
jdh.hamkins.org · 22h
λ Functional Programming
BetaZero V2: A Diffusion Model for Setting Boulder Problems
evmojo37.substack.com · 2h · Discuss: Substack
📊 Data Science
A Conceptual Framework for Exploration Hacking
lesswrong.com · 9h
λ Functional Programming
How to Leverage Explainable AI for Better Business Decisions
towardsdatascience.com · 10h
🤖 AI
Feedback Control for Computer Systems
janert.org · 17h
🤖 AI
Optimizing post-disaster road restoration with reinforcement learning: A traveler-behavior-aware approach
sciencedirect.com · 9h
🤖 AI
Artificial Intelligence and the Passivity Problem
psychologytoday.com · 7h
🤖 AI
Observe emergent behavior in autonomous multi-agent LLM networks
agents.glide2.app · 2d · Discuss: Hacker News
🤖 AI
Entropic Balance with Feedback Control: Information Equalities and Tight Inequalities
link.aps.org · 2d
🤖 AI
v6 (Code 2 here) - Most complete architecture. This version is faster than my old v5, statistically correct, has all the advanced psychology/network features, and produces stunning visualizations
gist.github.com · 6h · Discuss: r/C_Programming
📊 Data Science
Show HN: A minimal online decision maker
decisionmaker.online · 1d · Discuss: Hacker News
🤖 AI
For real game-theoretic reasoning, we need best response in imperfect information games
weyxie.bearblog.dev · 3d · Discuss: Hacker News
🤖 AI
New technology in programming and poker
natemeyvis.com · 3h
🤖 AI
Show HN: I taught AI to remember. Then it warned me
github.com · 48m · Discuss: Hacker News
🤖 AI